Ekaitz's tech blog:
I make stuff at ElenQ Technology and I talk about it

So we’ve got sidetracked…

From the series: Bootstrapping GCC in RISC-V

There are many software projects involved in our bootstrapping process, not only compilers! And many of them are not fully supported for RISC-V or we don’t have the compilers ready to build them. In order to be able to build everything, we need to touch the build scripts of many of them, add patches to them or fix our C standard library and compilers, as they are extremely minimal and may lack support.

During these days we had a really interesting case of back-and-forth that sidetracked us a little bit, so let me share it with you, and meanwhile I’ll introduce most of the work we’ve been doing since January.

Gash

In the bootstrapping in Guix we don’t rely on Bash to run our scripts. Instead we use Gash.

During the bootstrapping process in Guix, we found that Gash hangs at some specific points of the process, mostly when configuring Binutils.

I managed to use Bash for Binutils on my RISC-V hardware instead, but on my x86_64 laptop I was unable to build all the dependencies I needed to build Bash for RISC-V. This is not cool.

Thankfully, Gash maintainers are friendly and we can talk with them to try to fix the issue.

Gzip

Gzip is also an integral part of the process, as we download the software releases in Gzip format. We need to decompress them as fast as possible (and correctly).

It built pretty easily using the bootstrappable TinyCC, but it didn’t run properly at first. It was able to decompress files but then, when it tried to compare the file checksum, it failed to do so.

This happened to be related to some missing integer support in our bootstrappable TinyCC. The riscv-mes branch of our bootstrappable TinyCC shows all the commits we needed to add to fix this issue and the rest of the issues that I share in this post.

The Gzip issue was fixed by this commit[1]:

589d2ab1 RISCV: 32bit val sign extension

It adds some sign extension support for 32 bit values that we were missing. Without it, the binary operations that calculate the checksum in Gzip were simply wrong, as everything was incorrectly sign-extended[2].
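To get a feeling for why a wrong sign extension breaks a checksum, here is a minimal sketch. It is neither Gzip's code nor the TinyCC fix: `crc32_with` is a textbook bitwise CRC-32, and the two `shift_right_*` helpers are hypothetical names that model a 32 bit value living in a 64 bit register, once zero-extended and once (wrongly, for an unsigned value) sign-extended.

```c
#include <stddef.h>
#include <stdint.h>

/* Correct for unsigned values: the upper 32 bits of the register are zero. */
uint32_t shift_right_zero_extended(uint32_t value)
{
    uint64_t reg = (uint64_t)value;
    return (uint32_t)(reg >> 1);
}

/* Models the bug: the value is sign-extended before the shift, so a set
 * bit 31 is copied into bit 32 and comes back into bit 31 after shifting.
 * The unsigned-to-signed cast is implementation-defined in C; GCC and
 * Clang wrap, which is what this sketch assumes. */
uint32_t shift_right_sign_extended(uint32_t value)
{
    uint64_t reg = (uint64_t)(int64_t)(int32_t)value;
    return (uint32_t)(reg >> 1);
}

/* Textbook bitwise CRC-32 (reflected polynomial 0xEDB88320), with the
 * shift step passed in so both behaviors can be compared. */
uint32_t crc32_with(const unsigned char *data, size_t len,
                    uint32_t (*shift)(uint32_t))
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            uint32_t mask = (crc & 1u) ? 0xEDB88320u : 0u;
            crc = shift(crc) ^ mask;
        }
    }
    return ~crc;
}
```

With the zero-extending shift, 0x80000000 becomes 0x40000000; with the sign-extending one it becomes 0xC0000000. One wrong bit per step is more than enough to ruin a checksum.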

As you might imagine, due to the lack of proper debug symbols and the fact that the issue was so specific, this was really hard to deal with until we found out the real problem. Of course, this very issue would affect many other programs but this was the first time we saw it. That’s why it’s very important to fix things properly, as they may have many ramifications.

GNU-Make

One of the other dependencies is GNU-Make, needed in many projects. In the previous steps of the bootstrap we manage by running commands manually, but in more complex projects GNU-Make is necessary.

In January we built Make using our bootstrappable TinyCC, but it didn’t work!

At first it segfaulted when using long options (--help), while short ones (-h) worked. This happened because our bootstrappable TinyCC had a missing piece from the backport I did. The commit db9d7b09 of the riscv-mes branch and the following two show a clear history of how this worked. We first realized the char * was loaded to a register using an lb instruction, which is a load byte. I realized this by printing the value of the pointer in hex format, which showed that only the lower bytes were the same, while the higher ones were empty. Then I disassembled the code and found the lb, which should have been an ld (load doubleword): 64 bits, the size of a pointer on a 64 bit machine.

The problem here was that the char * was detected in the compiler as an array of characters, which has a size of 1: a byte. The TinyCC I took the code from uses a function that calculates the size of the type, a pretty reasonable thing to have in a compiler. The problem we had was that the type information was not stored properly, and that function calculated the size based on that wrong information. My first attempt was to use a different function for that, but when I sent a patch to upstream TinyCC, thinking they also had this issue, they told me they didn’t have it in the first place[3]. That was more than surprising for me, so I dug into the Git history until I realized they had a very interesting commit: Make casts lose top-level qualifiers. Ah-ha! The commit only has one line and then some tests. This is the content, without the tests:

+    vtop->type.t &= ~ ( VT_CONSTANT | VT_VOLATILE | VT_ARRAY );

This removes the VT_ARRAY flag from the pointer in the type, so the function that calculates the size treats the type as a pointer, 64 bits in our case, so ld is emitted and we are happy. We cherry-picked the commit from upstream, reverted our fix and moved on.
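The difference between both loads can be modelled in plain C. This is only a sketch: `load_doubleword` and `load_byte` are made-up names that mimic what ld and lb do with a 64 bit memory slot on a little-endian machine, and the address used below is invented for the example.

```c
#include <stdint.h>

/* What ld does: all 64 bits of the slot end up in the register. */
uint64_t load_doubleword(uint64_t slot)
{
    return slot;
}

/* What lb does: only the lowest byte is loaded, then sign-extended to
 * fill the register. The rest of the pointer is lost.
 * The narrowing cast to int8_t is implementation-defined in C; GCC and
 * Clang wrap, which is what this sketch assumes. */
uint64_t load_byte(uint64_t slot)
{
    return (uint64_t)(int64_t)(int8_t)(uint8_t)(slot & 0xFFu);
}
```

For a pointer like 0x0000003FFFFFE950, `load_byte` returns 0x50: the lowest byte matches and everything above it is gone, so dereferencing the result segfaults.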

But of course that was not enough, that’d be too easy. We found some other issues in Make.

Now Make was running and not failing, but it said "No makefile was found" and it never ran the recipes. We realized later there was some kind of issue when reading files, and my colleague Andrius found that the getdents system call returns a different struct in 64 bits and we were reading it as the 32 bit structure, so he fixed that in MeslibC for all 64 bit architectures. This error makes a lot of sense in MeslibC, because all the previous attempts in the bootstrapping were in 32 bits and our starting point only supports that. That’s another source of errors we have: we are also making this whole thing 64 bit ready.
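A sketch of why the wrong struct makes the Makefile invisible. The two layouts below are modelled after the records Linux documents for getdents (on a 32 bit machine) and getdents64, written with fixed-width types so they can be compared anywhere. The struct and function names are made up for the example, and this is not the MeslibC code.

```c
#include <stddef.h>
#include <stdint.h>

/* Directory record as seen with 32 bit fields: the name starts at byte 10. */
struct dirent_layout_32 {
    uint32_t d_ino;
    uint32_t d_off;
    uint16_t d_reclen;
    char     d_name[1];
};

/* Directory record with 64 bit fields and a type byte: the name starts
 * at byte 19. */
struct dirent_layout_64 {
    uint64_t d_ino;
    int64_t  d_off;
    uint16_t d_reclen;
    uint8_t  d_type;
    char     d_name[1];
};

/* Reads the file name assuming the 32 bit layout. */
const char *name_with_32bit_layout(const unsigned char *record)
{
    return (const char *)record + offsetof(struct dirent_layout_32, d_name);
}

/* Reads the file name assuming the 64 bit layout. */
const char *name_with_64bit_layout(const unsigned char *record)
{
    return (const char *)record + offsetof(struct dirent_layout_64, d_name);
}
```

If the kernel fills a record using the 64 bit layout and the name is read at the 32 bit offset, the reader lands in the middle of d_off and gets garbage or an empty string instead of "Makefile".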

Once Make was able to find and read the Makefile and run it, we found another problem, this one related to the dates of the files. Make started to give us weird messages like "Timestamp out of range; substituting...". Later, I found that some recipes were executed even if the files they required hadn’t changed.

This is not a big deal if you just want things to be built once, so we left this as a not-very-important-thing until I used this Make in a Guix package. The gnu-build-system in Guix first runs ./configure (configure phase) and later runs make (build phase). This make reran the ./configure command from the previous step, because it thought some of the files were changed between both phases. This behavior is more problematic than it seems, because Guix needs to fix the shebangs of all the scripts in the project[4], and it has a phase for this between the ones I just mentioned: patch-generated-file-shebangs. If it’s the make run itself that configures the project and right after that starts building, the shebangs of the generated files are not fixed, and the process fails. The issue is not a not-very-important-thing anymore!
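The rule behind this behavior is simple, and that is why wrong timestamps hurt so much. A simplified sketch of the idea, not GNU-Make's actual code (`needs_rebuild` is a made-up name and it ignores missing files, phony targets and everything else Make handles):

```c
#include <stddef.h>
#include <stdint.h>

/* A target is out of date when any prerequisite is newer than it. */
int needs_rebuild(int64_t target_mtime,
                  const int64_t *prereq_mtimes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (prereq_mtimes[i] > target_mtime)
            return 1;
    }
    return 0;
}
```

If the modification times come back wrong from the C library, a prerequisite can look newer than its target even though nothing changed, and the recipe runs again.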

Of course, after what I just explained I was forced to fix this. Some debugging sessions later I found that the result of the stat system call was not interpreted correctly in MeslibC. There were some padding issues, so I fixed that for RISC-V, and that mostly fixed Make. Now Make, built using our bootstrappable TinyCC, works well enough for us.
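A padding issue of this kind can be reproduced with two tiny structs. Both layouts are hypothetical and much smaller than the real struct stat; they only show what happens when the library's view of a kernel record is missing a padding field.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record as the kernel writes it: a padding word sits
 * between the device field and the timestamp. */
struct kernel_record {
    uint64_t rdev;
    uint64_t pad;
    int64_t  mtime;
};

/* Hypothetical library-side view that forgot the padding. */
struct libc_record_wrong {
    uint64_t rdev;
    int64_t  mtime;
};

/* Reads the timestamp with the matching layout. */
int64_t read_mtime_right(const struct kernel_record *from_kernel)
{
    return from_kernel->mtime;
}

/* Reads the timestamp through the wrong layout: it actually returns
 * whatever is stored in the padding word. */
int64_t read_mtime_wrong(const struct kernel_record *from_kernel)
{
    struct libc_record_wrong view;
    memcpy(&view, from_kernel, sizeof view);
    return view.mtime;
}
```

Every field after the missing padding is shifted, so the timestamps Make receives have nothing to do with the real ones.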

TinyCC

In my talk this February at FOSDEM 2024 I explained that upstream TinyCC was missing some RISC-V support and that we didn’t have it working yet. Since then we solved the main issue we had with it:

Unimplemented large addend for global address

I had no idea how to fix this, so I wrote an email to the person who wrote most of the code around the relocations, and he gave me a very interesting answer. Thank you, Michael.

That answer was more than enough for me to write the code for it (it was almost done) and in a couple of hours I had a fix for this. The large addend support was pretty simple, actually. It was just that relocations are still a little bit scary for me, and the codebase doesn’t help a lot.
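The fix itself is not shown here, but some background helps to see why addends matter. On RISC-V an address is usually built with a pair of instructions: one that sets the upper 20 bits (lui or auipc) and one that adds a signed 12 bit immediate. The sketch below is generic RISC-V arithmetic, not TinyCC's code; `split_hi20_lo12` is a name made up for the example and it does no range checking.

```c
#include <stdint.h>

/* Splits a value (for instance symbol + addend) into the upper part for
 * lui/auipc and the signed 12 bit part for the following instruction.
 * The low part is sign-extended by the hardware, so when it is negative
 * the upper part has to be one bigger to compensate. */
void split_hi20_lo12(int32_t value, int32_t *hi20, int32_t *lo12)
{
    int64_t v = value;
    int64_t lo = v & 0xFFF;

    if (lo >= 0x800)
        lo -= 0x1000;               /* interpret as signed 12 bit */

    *lo12 = (int32_t)lo;
    *hi20 = (int32_t)((v - lo) / 4096);  /* exact: v - lo is a multiple of 4096 */
}
```

For the value 0x12345FFF this gives an upper part of 0x12346 and a lower part of -1, and (0x12346 << 12) - 1 is the original value again.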

With this issue fixed, we can now move to upstream TinyCC and use it for the later steps of the project, as we do in the bootstrapping chains of other architectures, because upstream TinyCC is more stable and capable than our bootstrappable fork.

Binutils

We need to remember our goal is to build GCC. That’s why we try to use upstream TinyCC, as it is able to build it, whereas our bootstrappable TinyCC might not be.

Building GCC requires Binutils, so we tried to build it. We had several issues in Binutils and we haven’t managed to produce Binutils programs that don’t explode. The problem is probably caused by limitations of our standard library, so here comes the sidetrack.

We considered using Musl instead, as it’s a powerful standard library that is also very simple.

Musl

Musl is really cool. We’ve used it a lot as a reference for MeslibC, but Musl is not used in Guix’s bootstrapping process in other architectures. Our plan is to try to use it for Binutils to see if our broken binaries are because of MeslibC or because of something else.

Musl, as most C standard libraries, requires some support for assembly, and more specifically Extended Asm.

We already talked about Extended Asm[5] support before but, in summary, it was unimplemented in TinyCC’s backend for RISC-V.

Apart from that, TinyCC lacks some very important pseudoinstructions that are used in Musl, and the assembly syntax it uses is not the one that the GNU Assembler uses, so TinyCC is unable to assemble simple instructions like:
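For those who have never seen it, this is roughly what Extended Asm looks like in GCC syntax. It is a toy example, not something taken from Musl: `add_with_asm` is a made-up function, and the plain C fallback is only there so the snippet also builds on machines that are not RISC-V.

```c
/* Adds two numbers using a RISC-V add instruction when available.
 * %0 is the output operand, %1 and %2 are the inputs; the compiler
 * chooses the registers and wires the C variables to them. */
long add_with_asm(long a, long b)
{
#if defined(__riscv)
    long result;
    __asm__ ("add %0, %1, %2"
             : "=r" (result)        /* outputs */
             : "r" (a), "r" (b));   /* inputs */
    return result;
#else
    return a + b;                   /* fallback for other architectures */
#endif
}
```

A C standard library needs this kind of block for things like system calls, where values have to be placed in specific registers before jumping into the kernel.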

ld a0, 8(a0)

As TinyCC expects something like:

ld a0, a0, 8

Hmm…

Back to TinyCC

This is where the sidetrack went so wild that we went back to almost the beginning. I wanted to make Musl build, so I started to write support for everything I wanted it to do.

I implemented many pseudoinstructions and instructions that were missing and Musl needed. This includes GNU Assembler syntax for memory access instructions like loads and stores. By the way, don’t trust them blindly because I realized I did jal wrong (some relocation issue again!) and I had to fix it later.
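The GNU-style memory operand boils down to reading an optional offset followed by a base register between parentheses. A minimal sketch of that parsing step, which is not the code that went into TinyCC (`parse_mem_operand` is a made-up helper and it does not check that the register name is valid):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Parses a GNU Assembler style operand such as "8(a0)" or "-16(sp)".
 * Stores the offset and the base register name; returns 0 on success
 * and -1 if the text does not have the offset(base) shape. */
int parse_mem_operand(const char *text, long *offset,
                      char *base, size_t base_size)
{
    char *end;
    const char *close;
    size_t len;

    /* With no digits strtol returns 0, so "(a0)" means offset 0. */
    *offset = strtol(text, &end, 0);
    if (*end != '(')
        return -1;

    close = strchr(end + 1, ')');
    if (close == NULL || close[1] != '\0')
        return -1;

    len = (size_t)(close - (end + 1));
    if (len == 0 || len >= base_size)
        return -1;

    memcpy(base, end + 1, len);
    base[len] = '\0';
    return 0;
}
```

Parsing "8(a0)" gives the offset 8 and the base register a0, which is the same information the old `ld a0, a0, 8` form carried as two separate operands.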

I also added the .option directive for RISC-V assembly, which is used really often (it is only accepted for now; I didn’t implement it yet). I did enough to make the builds pass. Most of the time the .option directive is used to disable linker relaxation, which TinyCC doesn’t do anyway so… Why bother?

I also have a draft for the Extended Asm, and I have it kind of working. I am not sure about some of the things I did but I feel it’s pretty close.

The Extended Asm support is not upstreamed yet, but I sent it to the TinyCC mailing list. The rest has already been sent to TinyCC and you can see it in the mob branch.

MeslibC

Of course, I can’t stop, so I took all the support I wrote for TinyCC and tried to apply it to the bootstrappable TinyCC.

I was also a little bit forced to do so because we rebuild MeslibC with TinyCC and after the changes we could not do it. When we started we had to make a copy of MeslibC that didn’t have the GNU As style assembly and supported the TinyCC style assembly instead. Mes’ Guix package as-is only provides one of the flavors of the MeslibC code, the TinyCC style one, which we can’t rebuild with the modern support in TinyCC.

My solution was to backport all the Extended Asm support and all the new assembler to the bootstrappable TinyCC and then remove the MeslibC copy that used the old syntax. I managed to make it build, but the executables generated with it explode at the time of writing, so we need to review that further. In any case, this is a good change because it reduces the amount of code we have, and it uses the more recent TinyCC assembler, which has had many improvements since I did the backport a year ago.

So…

It looks like we are back again at the very beginning, and near the end at the same time, if you take into account what I shared in the latest post of the series about GCC.

We still need to work on some other related projects, like Patch, that would allow us to apply our bootstrapping patches, but that’s also almost working. I want to believe it’s not going to give us many headaches in the future.

In summary, it looks like sometimes you have to run and later go back to walk the same path, slowly this second time, with all the knowledge you got in the first run.

Here we are. Sidetracked, but also pretty happy, as this is still going forward.


  1. Link to GitHub 

  2. I already told you integers are hard 

  3. Yes, I should have checked better. My bad. 

  4. Guix doesn’t store binaries in the classic places. It does not follow the Filesystem Hierarchy Standard. It needs to replace the references to things like #!/bin/bash with something like #!/gnu/store/295aavfhzcn1vg9731zx9zw92msgby5a-bash-5.1.16/bin/bash 

  5. Extended Asm helps you call assembly blocks using C variables, and it also protects the variables you don’t want to touch. You can read more about that in GCC’s documentation